Abstract



The Maximum Betweenness Centrality problem (MBC) can be defined as follows.
Given a graph, find a k-element node set C that maximizes the probability of detecting communication
between a pair of nodes s and t chosen uniformly at random. It is assumed that the
communication between s and t is realized along a shortest s-t path which is, again, selected
uniformly at random. The communication is detected if the communication path contains a
node of C.

Recently Dolev et al. (2009) showed that MBC is NP-hard and gave a $(1 - 1/e)$-approximation
using a greedy approach. We provide a reduction of MBC to Maximum Coverage that simplifies
the analysis of the algorithm of Dolev et al. considerably. Our reduction allows us to
obtain a new algorithm with the same approximation ratio for a (generalized) budgeted version
of MBC. We provide tight examples showing that the analyses of both algorithms are best possible.
Moreover, we prove that MBC is APX-complete and provide an exact polynomial-time
algorithm for MBC on tree graphs.

1 Introduction 

A question that frequently arises in the analysis of complex networks is how central or important a 
given node is. Examples of such complex networks are communication or logistical networks. There 
is a multitude of different measures of centrality known in the literature. Many of these measures 
are based on distances. Consider, for example, the measures used for the center or the median 
location problem. We, in contrast, are interested in centrality measures that aim at monitoring 
communication or traffic. 

We investigate a centrality measure called shortest path betweenness centrality [3]. This 
measure can be motivated by the following scenario that relies only on very basic assumptions. 
Communication occurs between a pair (s, t) of distinct nodes that is selected uniformly at random 
among all node pairs. The communication is always established along a shortest s-t path where each 
such path is chosen with equal probability. The centrality of a node v is defined as the probability 
of detecting the communication, that is, the probability that v lies on the communication path. 

As a possible application we refer to the task of placing a server in a computer network so as to
maximize the probability of detecting malicious data packets. Another example is the deployment
of toll monitoring systems in a road network.

As suggested by the previous application example, a natural extension of the above scenario
is to measure the probability of detecting communication for a whole set of nodes. The resulting
centrality measure is called group betweenness centrality [5].

In this paper we investigate the problem of finding a given number k of nodes such that the group 
betweenness centrality is maximized. We call this problem MAXIMUM BETWEENNESS CENTRALITY 
(MBC). 

Previous Results The shortest path betweenness centrality was introduced by Freeman [?].
Brandes [2] and Newman [9] independently developed the same algorithm for computing the shortest
path betweenness centrality of all nodes in $O(nm)$ time.

Group betweenness centrality was introduced by Everett and Borgatti [5]. Puzis et al. [11] gave
an algorithm for computing the group betweenness centrality of a given node set that runs in $O(n^3)$ time.

Puzis et al. [12] introduced MBC, that is, the problem of finding a k-element node set maximizing
the group betweenness centrality. They showed that the problem is NP-hard. They also gave a
greedy algorithm [12, 11] and showed that their algorithm yields an approximation factor of $1 - 1/e$
[1]. We remark that Puzis et al. used the name KPP-Com instead of MBC.

Our Contribution We provide a reduction from MBC to the well-known MAXIMUM COVERAGE
problem which we define in Section 2. This reduction yields a much simpler proof of the
approximability result of Dolev et al. [1]. Our reduction also allows us to derive a new algorithm for a
budgeted version of the problem, which achieves the same approximation factor. One remarkable
property of our reduction is that it is not a polynomial time reduction. Rather, the reduction is
carried out implicitly and aims at analyzing the algorithms.

We show that the analyses of these algorithms cannot be improved by providing tight examples
(see Section 3). We also prove that MBC is APX-complete, thereby showing that MBC does not
admit a PTAS (see Section 4).

Finally, we develop an exact polynomial-time algorithm for MBC on tree graphs (see Section 5).

Problem Definition The input of MBC is an undirected and connected graph $G = (V, E)$ with
node costs $c\colon V \to \mathbb{R}_0^+$ and a budget $b$. Let $s, t \in V$ be the two communicating nodes. By $\sigma_{s,t}$ we
denote the number of shortest paths between $s$ and $t$. For $C \subseteq V$ let $\sigma_{s,t}(C)$ be the number of
shortest s-t paths containing at least one node of $C$. So $C$ detects the communication of $s$ and $t$
with probability $\sigma_{s,t}(C)/\sigma_{s,t}$ since we assume that the communication path is selected uniformly at
random among all shortest s-t paths. As the selection of any node pair as the communicating pair
$(s, t)$ is equally likely, the probability that $C$ detects the communication is proportional to the sum

$$\mathrm{GBC}(C) := \sum_{s,t \in V,\ s \neq t} \frac{\sigma_{s,t}(C)}{\sigma_{s,t}},$$

which is called Group Betweenness Centrality. The MAXIMUM BETWEENNESS CENTRALITY problem
consists in finding a set $C \subseteq V$ with $c(C) \le b$ such that the group betweenness centrality
$\mathrm{GBC}(C)$ is maximized.
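
To make the definition concrete, the following minimal Python sketch evaluates GBC(C) by brute force. It is an illustration only, not the data structure of Puzis et al.; the names `bfs_counts` and `gbc` are hypothetical. It assumes a connected, unweighted graph given as an adjacency dictionary, sums over unordered pairs {s, t}, and treats the end nodes s and t as lying on the path. The number of shortest s-t paths that avoid C is obtained from a second BFS in the graph without C.

```python
from collections import deque
from fractions import Fraction

def bfs_counts(adj, source, blocked=frozenset()):
    """BFS from source that never enters `blocked`; returns (dist, sigma),
    where sigma[w] counts the shortest source-w paths found."""
    if source in blocked:
        return {}, {}
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in blocked:
                continue
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def gbc(adj, C):
    """Sum of sigma_st(C) / sigma_st over unordered pairs {s, t} (assumed convention)."""
    C = frozenset(C)
    nodes = list(adj)
    total = Fraction(0)
    for i, s in enumerate(nodes):
        dist, sigma = bfs_counts(adj, s)
        dist_c, sigma_c = bfs_counts(adj, s, C)
        for t in nodes[i + 1:]:
            # Shortest s-t paths avoiding C exist only if removing C
            # leaves the s-t distance unchanged.
            avoiding = sigma_c[t] if dist_c.get(t) == dist[t] else 0
            total += Fraction(sigma[t] - avoiding, sigma[t])
    return total
```

On the 4-cycle 0-1-2-3-0 with C = {1}, three pairs have node 1 as an end node and the pair {0, 2} is detected with probability 1/2, so the sketch returns 7/2.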

2 Approximation Algorithms 

The Reduction Dolev et al. [1] prove the approximation factor of their algorithm by a technique
inspired by a proof of the same factor for the greedy algorithm for the well-known MAXIMUM
COVERAGE problem [6].

In what follows we give a reduction to BUDGETED MAXIMUM COVERAGE [8] which is defined
as follows. The input is a set $S$ of ground elements with weight function $w\colon S \to \mathbb{R}_0^+$, a family $\mathcal{F}$
of subsets of $S$, costs $c'\colon \mathcal{F} \to \mathbb{R}_0^+$ and a budget $b \ge 0$. The goal is to find a collection $C' \subseteq \mathcal{F}$ with
$c'(C') \le b$ such that the total weight $w(C')$ of ground elements covered by $C'$ is maximized.

The idea of our reduction is to model every shortest path of the graph G by a ground element with 
a corresponding weight. Every node v of G is modeled by the set of (ground elements corresponding 
to) shortest paths that contain v. 

Let $(G = (V, E), c, b)$ be an instance of MBC. Let $S(G)$ be the set of all shortest s-t paths
between pairs $s, t$ of distinct nodes. For a shortest s-t path $P$ let $w(P) := 1/\sigma_{s,t}$ be its weight.

For a node $v$ let $S(v)$ be the set of all shortest paths containing $v$. Set $c'(S(v)) := c(v)$. Finally
let $\mathcal{F}(G) := \{\, S(v) \mid v \in V \,\}$ be our family of sets. This completes the construction of our instance
$(S(G), w, \mathcal{F}(G), c', b)$ of BUDGETED MAXIMUM COVERAGE.

Let $C \subseteq V$ be a set of nodes. Then $S(C) := \bigcup_{v \in C} S(v)$ denotes the set of all shortest paths
containing at least one node of $C$. It is not hard to check that

$$w(S(C)) = \sum_{s,t \in V,\ s \neq t} \frac{1}{\sigma_{s,t}} \cdot \sigma_{s,t}(C) = \mathrm{GBC}(C)$$

holds. Therefore, the group betweenness centrality of a set of nodes equals the weight of the 
corresponding set of shortest paths in the maximum coverage instance. Of course the feasible 
solutions of MBC and the feasible solutions of the reduced instance of MAXIMUM COVERAGE are in 
1-1-correspondence and have the same goal function value. Hence corresponding feasible solutions 
have also the same approximation ratio for the respective problem instances. We will exploit this 
fact to turn approximation algorithms for MAXIMUM COVERAGE into approximation algorithms for 
MBC with the same approximation ratio, respectively. We note, however, that the reduction is not 
polynomial. 
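
For very small graphs the reduction can be carried out explicitly, which may help to see why the weight of the covered paths equals the group betweenness centrality. The sketch below is a hypothetical illustration under the assumption of a connected graph with sortable node labels and unordered pairs {s, t}; the function names are not from the paper. As noted above, the number of enumerated paths can be exponential, so this is only meant for toy instances.

```python
from collections import deque
from fractions import Fraction

def all_shortest_paths(adj, s, t):
    """Enumerate every shortest s-t path as a tuple of nodes."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    paths = []

    def extend(path):
        u = path[-1]
        if u == s:
            paths.append(tuple(reversed(path)))
            return
        for w in adj[u]:
            # Walk backwards from t along edges that decrease the distance to s.
            if dist.get(w) == dist[u] - 1:
                extend(path + [w])

    if t in dist:
        extend([t])
    return paths

def coverage_instance(adj):
    """Return (weights, family): weights[P] = 1/sigma_st and family[v] = S(v)."""
    nodes = sorted(adj)
    weights, family = {}, {v: set() for v in nodes}
    for i, s in enumerate(nodes):
        for t in nodes[i + 1:]:
            paths = all_shortest_paths(adj, s, t)
            for p in paths:
                weights[p] = Fraction(1, len(paths))
                for v in p:
                    family[v].add(p)
    return weights, family

def covered_weight(weights, family, C):
    """Weight w(S(C)) of all shortest paths containing at least one node of C."""
    covered = set().union(*(family[v] for v in C))
    return sum((weights[p] for p in covered), Fraction(0))
```

On the 4-cycle 0-1-2-3-0 the instance has eight ground elements, and the set S(1) together with the weights yields w(S({1})) = 7/2, matching the value of GBC({1}) computed directly from the definition.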

The Unit-Cost Version First we consider the unit cost variant of MBC, that is, c = 1, which 
has been introduced by Dolev et al. [12]. 

Consider an instance of unit-cost MBC. Then the reduction of the previous section yields an
instance of unit-cost MAXIMUM COVERAGE. It is well-known that a natural greedy approach has
an approximation factor of $1 - 1/e$ for unit-cost MAXIMUM COVERAGE [6]. The greedy algorithm
works as follows: Start with an empty set $C'$ and then iteratively add to $C'$ the set $S' \in \mathcal{F}$ that
maximizes $w(C' + S')$.
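
The greedy rule for weighted unit-cost MAXIMUM COVERAGE can be sketched as follows. This is a minimal illustration with hypothetical names (`greedy_max_coverage`, `weights`, `family`); ties are broken by the iteration order of the dictionary, which is an arbitrary choice of this sketch.

```python
def greedy_max_coverage(weights, family, k):
    """Pick up to k sets, each time the one adding the most uncovered weight.

    weights: dict mapping ground element -> weight
    family:  dict mapping set name -> set of ground elements
    """
    chosen, covered = [], set()
    for _ in range(min(k, len(family))):
        best = max(
            (name for name in family if name not in chosen),
            key=lambda name: sum(weights[x] for x in family[name] - covered),
        )
        chosen.append(best)
        covered |= family[best]
    return chosen, sum(weights[x] for x in covered)
```

For example, with unit weights on the elements 1 to 5 and the sets A = {1, 2, 3}, B = {3, 4}, C = {4, 5}, two greedy steps select A and then C and cover total weight 5.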

Now let us turn back to MBC. Of course, we do not obtain an efficient algorithm if we apply
the above greedy algorithm explicitly to the instance of MAXIMUM COVERAGE constructed by our
reduction since this instance might be exponentially large. If we, however, translate the greedy
approach for MAXIMUM COVERAGE back to MBC we arrive at the following algorithm: Start with
an empty node set $C$ and then iteratively add to $C$ the node $v$ that maximizes $\mathrm{GBC}(C + v)$. Observe
that the greedy algorithm for MAXIMUM COVERAGE and the greedy algorithm for MBC produce
feasible solutions that correspond to each other according to our reduction. Hence the latter
algorithm has an approximation ratio of $1 - 1/e$, too.
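
A direct, unoptimized rendering of this greedy rule for unit-cost MBC might look as follows. It recomputes GBC from scratch in every step and therefore does not reflect the efficient implementation discussed next; the helper names are hypothetical, and the sketch assumes a connected, unweighted graph and unordered pairs {s, t}.

```python
from collections import deque
from fractions import Fraction

def bfs_counts(adj, source, blocked=frozenset()):
    """BFS avoiding `blocked`; returns distances and shortest-path counts."""
    if source in blocked:
        return {}, {}
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in blocked:
                continue
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def gbc(adj, C):
    """Brute-force group betweenness centrality over unordered pairs."""
    C = frozenset(C)
    nodes = list(adj)
    total = Fraction(0)
    for i, s in enumerate(nodes):
        dist, sigma = bfs_counts(adj, s)
        dist_c, sigma_c = bfs_counts(adj, s, C)
        for t in nodes[i + 1:]:
            avoiding = sigma_c[t] if dist_c.get(t) == dist[t] else 0
            total += Fraction(sigma[t] - avoiding, sigma[t])
    return total

def greedy_mbc(adj, k):
    """Iteratively add the node v that maximizes GBC(C + v)."""
    C = []
    for _ in range(min(k, len(adj))):
        v = max((u for u in adj if u not in C), key=lambda u: gbc(adj, C + [u]))
        C.append(v)
    return C
```

On a star the first greedy choice is the center, and on the path 0-1-2-3-4 it is the middle node 2.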

An implementation of the greedy approach for MBC outlined before has been developed by Dolev
et al. [12, 11]. The authors, however, carry out the analysis of its approximation performance
from scratch inspired by the analysis of Feige [6] for MAXIMUM COVERAGE.

The crucial point in the implementation of Dolev et al. [12, 11] is, given a node set $C$, how to
determine a node $v$ maximizing $\mathrm{GBC}(C + v)$. The main idea of their algorithm is to maintain a
data structure that allows one to obtain the value $\mathrm{GBC}(C + v)$ for any $v \in V$ in $O(1)$ time where $C$
is the set of nodes that the greedy algorithm has chosen so far. An update of their data structure
takes $O(n^2)$ time if a node $v$ is added to $C$. The total running time of all greedy steps is therefore
$O(kn^2)$. This running time is dominated by the $O(n^3)$ time needed for a preprocessing step for the
initialization of their data structure.

The Budgeted Version The natural generalization of the greedy approach to BUDGETED MAXIMUM
COVERAGE would add in each greedy step a set $S'$ that maximizes the relative gain
$(w(C' + S') - w(C'))/c'(S')$ among all sets that respect the budget bound, that is, $c'(C' + S') \le b$. Here, $C'$
is the collection of sets already selected.

As shown by Khuller et al. [8] this simple approach achieves an approximation factor of $1 - 1/\sqrt{e}$
($\approx 0.39$) in the case of arbitrary costs. The authors, however, give a modified greedy algorithm with
an approximation factor of $1 - 1/e$ ($\approx 0.63$). The difference to the naive approach is not to start
with an empty set $C'$ but to try all initializations of $C'$ with at most three sets of $\mathcal{F}$ that respect
the budget bound $b$. Each of these initializations is then augmented to a candidate solution using
the above greedy steps. The algorithm chooses the best among the candidate solutions.

By means of our reduction, we transform this algorithm into an algorithm for budgeted MBC
that has the same approximation ratio (see Algorithm 1). We start with every set of at most
three nodes $C \subseteq V$ not exceeding the budget and then enlarge this set using greedy steps. Given
such a node set $C$, each greedy step selects the node $v$ that maximizes the relative gain
$(\mathrm{GBC}(C + v) - \mathrm{GBC}(C))/c(v)$ among all nodes that respect the budget bound, that is, $c(C + v) \le b$. Finally the
algorithm chooses the best candidate solution found. Our reduction proves that the approximation
performance of this algorithm is again $1 - 1/e$.



Algorithm 1: Greedy Algorithm for MBC
Input: G = (V, E), c, b
H := ∅
foreach C ⊆ V with |C| ≤ 3 and c(C) ≤ b do
    U := V \ C
    while U ≠ ∅ do
        u := argmax_{v ∈ U} (GBC(C + v) - GBC(C)) / c(v)
        if c(C + u) ≤ b then
            C := C + u
        U := U - u
    if GBC(C) > GBC(H) then H := C
return H
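
Algorithm 1 can be transcribed into Python almost line by line. The sketch below is a brute-force illustration that recomputes GBC in every step instead of using the data structure of Dolev et al.; it assumes strictly positive node costs (to avoid division by zero), a connected unweighted graph, and unordered pairs {s, t}. All function and parameter names are hypothetical.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations

def bfs_counts(adj, source, blocked=frozenset()):
    """BFS avoiding `blocked`; returns distances and shortest-path counts."""
    if source in blocked:
        return {}, {}
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in blocked:
                continue
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def gbc(adj, C):
    """Brute-force group betweenness centrality over unordered pairs."""
    C = frozenset(C)
    nodes = list(adj)
    total = Fraction(0)
    for i, s in enumerate(nodes):
        dist, sigma = bfs_counts(adj, s)
        dist_c, sigma_c = bfs_counts(adj, s, C)
        for t in nodes[i + 1:]:
            avoiding = sigma_c[t] if dist_c.get(t) == dist[t] else 0
            total += Fraction(sigma[t] - avoiding, sigma[t])
    return total

def budgeted_greedy_mbc(adj, cost, budget):
    """Try every feasible start set of at most three nodes, then augment greedily."""
    nodes = list(adj)
    best, best_val = frozenset(), Fraction(0)
    for size in range(4):
        for init in combinations(nodes, size):
            spent = sum(cost[v] for v in init)
            if spent > budget:
                continue
            C = set(init)
            val = gbc(adj, C)
            U = set(nodes) - C
            while U:
                # Node with the largest relative gain among the remaining ones.
                u = max(U, key=lambda v: (gbc(adj, C | {v}) - val) / cost[v])
                if spent + cost[u] <= budget:
                    C.add(u)
                    spent += cost[u]
                    val = gbc(adj, C)
                U.remove(u)
            if val > best_val:
                best, best_val = frozenset(C), val
    return best, best_val
```

A small example shows why the initializations matter: on a star whose center costs 3 and whose four leaves cost 1 each, with budget 3, the plain greedy run from the empty set buys three leaves (value 9), whereas the run initialized with the center returns the better solution of value 10.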



It remains to explain how a greedy step is implemented. As in the unit-cost case we can
employ the data structure of Dolev et al. that allows one to obtain the value $\mathrm{GBC}(C + v)$ in
$O(1)$ time. Since we know $\mathrm{GBC}(C)$ from the previous step, we can also compute the relative gain
$(\mathrm{GBC}(C + v) - \mathrm{GBC}(C))/c(v)$ for each node $v \in V$ in constant time.

As the update time of the data structure is $O(n^2)$ when the set $C$ is augmented by a node $v$ we
get a running time of $O(n^3)$ for the augmentation stage for any fixed initialization of $C$. Since there
are at most $O(n^3)$ initializations and the preprocessing of the data structure takes $O(n^3)$ time we
obtain a total running time of $O(n^6)$.

The simpler greedy approach (which only tests the initialization $C = \emptyset$) can of course also be
adopted for budgeted MBC. This algorithm runs in $O(n^3)$ time and has, as mentioned above, an
approximation factor of $1 - 1/\sqrt{e}$ (and $1 - 1/e$ in the case of unit costs).

Theorem 1 There is an $O(n^3)$-time factor-$(1 - 1/\sqrt{e})$ and an $O(n^6)$-time factor-$(1 - 1/e)$
approximation algorithm for MAXIMUM BETWEENNESS CENTRALITY. □

3 Tight Examples

Feige [6] showed that even the unit-cost MAXIMUM COVERAGE problem is not approximable within
an approximation factor better than $1 - 1/e$, thereby showing that the greedy algorithm is optimal
in terms of the approximation ratio. This lower bound, however, does not carry over immediately
to MBC because we have only a reduction from MBC to MAXIMUM COVERAGE and not the other
way round.

In what follows we provide a class of tight examples and thus show that the analyses of both
approximation algorithms considered in the previous section cannot be improved. Our examples
are unit-cost instances that are tight even for our modified greedy algorithm and thus also for the
greedy algorithm of Dolev et al. [12].

Tight Examples for MAXIMUM COVERAGE Our examples are derived from worst-case examples
of Khuller et al. [8] for unit-cost MAXIMUM COVERAGE. These examples use a $(k + 3) \times (k + 1)$
matrix $(x_{ij})$ with $i = 1, \ldots, k + 3$ and $j = 1, \ldots, k + 1$ where $k$ is the number of sets to be selected.
For each row and for each column there is a set in $\mathcal{F}$ that covers exactly the respective matrix
entries. Only for column $j = k + 1$ there is no such set.

By a suitable choice of the weights $w(x_{ij})$ Khuller et al. achieve that in an optimal solution only
rows are selected. On the other hand, the greedy algorithm augments every initialization of three
sets (rows or columns) by choosing only columns during the greedy steps. (The example exploits
that the greedy algorithm may always choose columns in case of ties.) They show that the output
produced this way has an approximation ratio arbitrarily close to $1 - 1/e$ for high values of $k$.

Tight Examples for MBC We simulate this construction by an instance of MBC. We use that
the weights $w(x_{ij})$ of matrix entries can be written as $w(x_{ij}) = a_{ij}/k^k$ where

$$a_{ij} = \begin{cases} \ldots & 1 \le j \le k \\ \ldots & j = k + 1. \end{cases}$$

It should be clear that the example remains tight if we redefine $w(x_{ij}) := a_{ij}$ for any matrix entry
$x_{ij}$.

For our instance of MBC we introduce two distinguished nodes $s$ and $t$. For an illustration of
our construction see Figure 1. The basic idea is to represent every matrix entry $x_{ij}$ by exactly
$a_{ij}$ shortest s-t paths. Each row $i$ is modeled by a node $b_i$ and each column $j$ is modeled by a node
$a_j$. The set of shortest s-t paths meeting both $a_j$ and $b_i$ is exactly the set of shortest s-t paths
representing $x_{ij}$.

For the sake of easier presentation we make some temporary assumptions. We explain later
how those assumptions can be removed. First we suppose that only paths from vertex $s$ to vertex $t$
contribute to the group betweenness centrality. Second, only the vertices $a_1, \ldots, a_k$ and $b_1, \ldots, b_{k+3}$
are candidates for the inclusion in a feasible solution $C \subseteq V$. Note that the node $a_{k+1}$ should not
be a candidate.

The shortest $a_j$-$b_i$ paths can be created by a diamond-like construction (Figure 1 shows this
construction for $a_{i,k+1}$).

Figure 1: a) construction of the tight examples; the dotted lines represent the nodes $a_j$ ($j = 1, \ldots, k + 1$)
and $b_i$ ($i = 1, \ldots, k + 3$), respectively, whereas the dashed lines mark the $a_j$-$b_i$ paths;
b) construction of the $a_{k+1}$-$b_i$ paths

Recall that each node $b_i$ represents the row $i$ and each node $a_j$ represents column $j$. Given our
preliminary assumptions, it is clear that in the above examples the feasible solutions for MAXIMUM
COVERAGE and MBC are in 1-1 correspondence. Moreover, corresponding solutions have the same
goal function value. Hence the modified greedy algorithm applied to the above instances produces
corresponding solutions for MAXIMUM COVERAGE and MBC. It follows that the factor of $1 - 1/e$
is tight at least for the restricted version of MBC that meets our preliminary assumptions.

Removing the Preliminary Assumptions First, we drop the assumption that only s-t paths
are regarded. We extend our schematic construction so that the shortest paths between all pairs of
vertices are considered but the matrix-like construction still works. We do this by replacing $s$ by a
number $l_s$ of vertices $s_i$ which are all directly linked with every $a_j$ and all other $s_{i'}$. Similarly, $t$ is
replaced by $l_t$ nodes $t_j$ directly linked with each $b_i$ and every other $t_{j'}$. By increasing the numbers
$l_s$ and $l_t$ we achieve that only paths of the $s_i$-$t_j$ type are relevant. This is because the number of
pairs $s_i, t_j$ is $\Omega(l_s l_t)$ whereas the total number of remaining node pairs is $O(l_s + l_t)$.

Although we have achieved that only $s_i$-$t_j$ paths have significant impact on the centrality of a
solution $C$, we might face problems if the numbers of covered $s_i$-$t_j$ paths are equal for two feasible
solutions. This is because we have assumed that the greedy algorithm chooses columns (or nodes
$a_j$) in case of ties regarding only $s_i$-$t_j$ paths. We can resolve this issue by making $l_s$ greater than
$l_t$; this ensures that during the greedy steps always one of the $a_j$ nodes is preferred.

The remaining problem is to ensure that only the nodes $a_1, \ldots, a_k$ and $b_1, \ldots, b_{k+3}$ are allowed
to be part of a solution. First we exclude $a_{k+1}$ as a candidate. This is accomplished by splitting
$a_{k+1}$ into multiple nodes, so that every $b_i$ has its own node $a_{k+1,i}$. The node $a_{k+1,i}$ is linked by an
edge with each $s_j$ and by $a_{i,k+1}$ paths with the node $b_i$. As all $s_{i'}$-$t_{j'}$ paths covered by $a_{k+1,i}$ are
also covered by $b_i$ we may assume that none of the nodes $a_{k+1,i}$ is used by a solution. Now consider
a node $u$ that lies on some shortest $a_j$-$b_i$ path. It can be observed that $a_j$ covers any shortest $s_{i'}$-$t_{j'}$
path that is covered by $u$. Therefore we may prefer $a_j$ over $u$. Finally consider a node $s_j$. Then the
centrality of $s_j$ is $O(l_s + l_t)$ whereas the centrality of any node $a_j$ is $\Omega(l_s l_t)$. It follows that only the
nodes $a_1, \ldots, a_k$ and $b_1, \ldots, b_{k+3}$ are relevant candidates for the inclusion in a good solution.
As all preliminary assumptions can be removed, we get

Theorem 2 The approximation factor of $1 - 1/e$ of the greedy algorithm for MBC is tight. □

Our construction uses unit weights only. As the modified greedy algorithm starts the greedy
procedure for every subset $C \subseteq V$ of at most three vertices, its output cannot be worse than the
output of the simpler greedy algorithm of Dolev et al. [12]. Hence the approximation factor of
$1 - 1/e$ of their algorithm is also tight.

4 APX-completeness 

In this section we prove that unit-cost MBC is APX-complete thereby showing that it does not 
admit a PTAS on general graphs. 

We do this by giving an approximation preserving reduction from MAXIMUM VERTEX COVER.
This problem is defined as follows. We are given an undirected graph $G = (V, E)$ and a number $k$.
We are looking for a k-element node set $V'$ such that the number of edges that are incident at some
node in $V'$ is maximum. MAXIMUM VERTEX COVER is known to be APX-complete [10].

Our proof consists of several steps. First we describe a polynomial time transformation of an 
instance (G, k) of MAXIMUM VERTEX COVER to an instance (G', k) of MBC. Then we introduce
a modified centrality measure GBC' for which it is easier to establish a correspondence between 
(approximate) solutions of MBC and MAXIMUM VERTEX COVER. We argue that it is sufficient to 
consider this modified measure instead of the betweenness centrality. Finally, we observe that for 
any (relevant) node set C its modified centrality GBC' in G' and the number of edges covered by 
C in G are proportional which completes the proof. 

The Transformation Given an instance $(G, k)$ of MAXIMUM VERTEX COVER we construct a
graph $G'$ that contains all nodes of $V$ and additionally for each $v \in V$ a set $v_1, \ldots, v_l$ of copies of
$v$. Here, $l$ is a large number to be chosen later.

Now we specify the edge set of $G'$. First we connect for each $v \in V$ the node set $\{v, v_1, \ldots, v_l\}$
to a clique with $l + 1$ nodes. Let $u, v$ be two distinct nodes in $V$. If $u$ and $v$ are adjacent in $G$ then
they are so in $G'$. If $u$ and $v$ are not adjacent in $G$ then we introduce an intermediate node $z_{uv}$ and
connect each $u_i$ and each $v_j$ with $z_{uv}$ where $i, j = 1, \ldots, l$. The number $k$ represents the cardinality
of the solution in both instances. This completes the construction of $G'$.
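
The transformation can be sketched in a few lines of Python. The node encoding used here (tuples tagged `"orig"`, `"copy"` and `"mid"`) and the function name `build_g_prime` are assumptions made for illustration only; the sketch assumes a simple undirected graph and l ≥ 1.

```python
from itertools import combinations

def build_g_prime(nodes, edges, l):
    """Return the adjacency sets of G' for G = (nodes, edges) with l copies per node."""
    edge_set = {frozenset(e) for e in edges}
    adj = {}

    def link(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    # Each original node v forms a clique with its copies v_1, ..., v_l.
    for v in nodes:
        clique = [("orig", v)] + [("copy", v, i) for i in range(1, l + 1)]
        for a, b in combinations(clique, 2):
            link(a, b)

    for u, v in combinations(nodes, 2):
        if frozenset((u, v)) in edge_set:
            # Adjacent in G: keep the edge between the original nodes.
            link(("orig", u), ("orig", v))
        else:
            # Non-adjacent in G: all copies of u and v meet at z_uv.
            z = ("mid", u, v)
            for i in range(1, l + 1):
                link(("copy", u, i), z)
                link(("copy", v, i), z)
    return adj
```

For the path a-b-c with l = 2 this yields 10 nodes (3 originals, 6 copies, one intermediate node for the non-adjacent pair a, c) and 15 edges.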

Modified Centrality Any pair $(u_i, v_j)$ of copies of distinct nodes $u, v \in V$ is called essential.
The remaining node pairs in $G'$ are inessential.

We are able to show that it suffices to work with the modified group betweenness centrality

$$\mathrm{GBC}'(C) := \sum_{(u_i, v_j) \text{ is essential}} \frac{\sigma_{u_i,v_j}(C)}{\sigma_{u_i,v_j}},$$

that is, to respect only essential node pairs. The basic reason for this is that for any node set $C$
the total contribution of inessential node pairs to the centrality measure GBC is linear in $l$. On
the other hand, the contribution of essential pairs to reasonable solutions is always at least $l^2$ since
the inclusion of at least one node $u \in V$ into $C$ already covers all $l^2$ shortest $u_i$-$v_j$ paths for any
$v$ adjacent to $u$ in $G$. Therefore we can make the impact of inessential pairs arbitrarily small by
choosing $l$ large enough.

Reduction from Maximum Vertex Cover to MBC Now we show that our above transformation
of $G$ to $G'$ can in fact be extended to an approximation preserving reduction from MAXIMUM
VERTEX COVER to the modified centrality problem. That is, we have to specify how a feasible solution
for the latter problem can be transformed back into a solution for MAXIMUM VERTEX COVER
that preserves the approximation ratio.

To this end consider an arbitrary node set $C$ in $G'$. If $C$ already covers all edges in $G$ we are
finished. Otherwise there is an edge $(u, v)$ that is not covered by $C$. Now assume that $C$ contains a
copy $u'_i$ of some node $u' \in V$. The only essential shortest paths that are occupied by $u'_i$ are $O(nl)$
shortest paths to copies $v'_j$ of nodes $v' \in V$ that are not adjacent to $u'$ in $G$. Now suppose that we
replace node $u'_i$ in $C$ with node $u$ of the uncovered edge $(u, v)$. Then $u$ covers at least $l^2$ previously
uncovered shortest $u_i$-$v_j$ paths between copies of $u$ and $v$, respectively. Thus if $l$ was chosen to be
large in comparison to $n$ the modified centrality can only increase under this replacement.

If $C$ contains an intermediate node $z_{u'v'}$ then this node covers exactly $l^2$ shortest $u'_i$-$v'_j$ paths.
Hence the modified centrality does not decrease if we replace $z_{u'v'}$ with $u$.

To summarize, we have shown how we can transform any node set $C$ in $G'$ into a node set for
$G$ without decreasing the modified centrality. In other words we can restrict our view to node
subsets of $V$. Now consider such a node set $C$ that contains only nodes of $V$. It is easy to verify
that $C$ covers exactly all shortest $u_i$-$v_j$ paths of edges $(u, v)$ in $G$ for which at least one end point
lies in $C$. In other words the modified centrality of $C$ equals the number of edges covered by $C$
multiplied with exactly $l^2$. Hence the measures for MAXIMUM VERTEX COVER and the modified
MBC are proportional. This completes the reduction from MAXIMUM VERTEX COVER to the
modified centrality problem.

Theorem 3 Unit-cost MBC is APX-complete. □ 

5 A Polynomial-Time Algorithm for Trees 

We complement the hardness result for general graphs of the previous section by a tractable special 
case. Specifically we show that the budgeted MBC problem can be solved efficiently on trees using 
a dynamic programming approach. 

Let $T = (V, E)$ be a tree. We assume that $T$ is rooted at some arbitrary node $r$. If $v$ is a node
in $T$ then $T_v$ denotes the subtree of $T$ hanging from $v$.

Let $s, t$ be an arbitrary pair of distinct nodes of the tree $T$. Since $T$ contains exactly one s-t
path, we have $\sigma_{s,t} = 1$. Let $C \subseteq V$ be a set of nodes. Then $\sigma_{s,t}(C) = 1$ if the s-t path contains
some node from $C$, and otherwise $\sigma_{s,t}(C) = 0$. Thus the betweenness centrality $\mathrm{GBC}(C)$ of $C$
simplifies greatly. It equals the number of s-t pairs ($s$ and $t$ always distinct) covered by $C$ (meaning
$\sigma_{s,t}(C) = 1$).
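
On a tree this count can be evaluated without any shortest-path computation: a pair is uncovered exactly if both nodes lie in the same connected component of the forest that remains after deleting C. The following sketch, with the hypothetical name `covered_pairs_in_tree`, counts unordered pairs under that observation and assumes the tree is given as an adjacency dictionary.

```python
from math import comb

def covered_pairs_in_tree(adj, C):
    """Number of unordered pairs {s, t} whose unique tree path contains a node of C."""
    seen = set(C)          # nodes of C are never part of an uncovered pair
    uncovered = 0
    for root in adj:
        if root in seen:
            continue
        # Explore one component of the forest obtained by deleting C.
        size, stack = 0, [root]
        seen.add(root)
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        uncovered += comb(size, 2)
    return comb(len(adj), 2) - uncovered
```

On the path 0-1-2-3-4 with C = {2}, the components {0, 1} and {3, 4} contribute one uncovered pair each, so 8 of the 10 pairs are covered.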

Our dynamic program uses a three-dimensional table $B$ whose entries we now define. Let $v$ be
some node in $T$, let $\sigma \le n^2$ be a non-negative integer value, and let $m \le |T_v|$. Then $B[v, \sigma, m]$
denotes the cost of the cheapest node set $C \subseteq T_v$ with the following two properties.

(i) $\mathrm{GBC}_v(C) \ge \sigma$ where $\mathrm{GBC}_v(C)$ denotes the number of s-t pairs in $T_v$ covered by $C$.

(ii) There are at least $m$ nodes $u$ (including $v$) in $T_v$ such that the $u$-$v$ path is not covered by $C$.
We call such nodes top nodes of $T_v$.

In what follows we describe how those $B[\cdot]$-values can be computed in polynomial time in a
bottom-up fashion. The optimum value of GBC in the input tree $T$ then equals the maximum value
$\sigma \le n^2$ such that $B[r, \sigma, 0] \le b$. We explain our algorithm for binary trees. The general case can
essentially be reduced to the case of binary trees by splitting any node with $k \ge 3$ children into
$k - 1$ binary nodes.

Consider a node $v$ with children $v_1$ and $v_2$. We wish to compute $B[v, \sigma, m]$. Assume by inductive
hypothesis that we already know all values $B[v_i, \cdot, \cdot]$ for $i = 1, 2$.

Suppose first that $m \ge 1$, which implies $v \notin C$. Let $m_i$ be the number of top nodes in $T_{v_i}$. Then
$m_1 + m_2 + 1 \ge m$. Altogether there are $\bar\sigma := (|T_{v_1}| + 1)(|T_{v_2}| + 1) - 1$ many s-t pairs such that $s$
and $t$ do not lie in the same subtree $T_{v_i}$. It is exactly those pairs of $T_v$ nodes that have not yet been
accounted for within the subtrees $T_{v_i}$. Such a pair is not covered if and only if $s$ and $t$ are both top
nodes of $T_v$. There are $(m_1 + 1)(m_2 + 1) - 1$ such pairs. Hence the number of covered node pairs $s, t$
such that $s$ and $t$ do not lie in the same subtree $T_{v_i}$ is given by
$\bar\sigma(m_1, m_2) := \bar\sigma - \bigl((m_1 + 1)(m_2 + 1) - 1\bigr)$.
The value $B[v, \sigma, m]$ is given by the minimum of the values $B[v_1, \sigma_1, m_1] + B[v_2, \sigma_2, m - m_1 - 1]$
such that $\sigma_1 + \sigma_2 + \bar\sigma(m_1, m - m_1 - 1) = \sigma$. Therefore $B[v, \sigma, m]$ can be computed in $O(m\sigma) = O(n^3)$
time.
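
The bookkeeping for pairs that cross between the two subtrees can be checked on a tiny example. The sketch below only illustrates the two counts used in this case, namely the total number of crossing pairs and the number of covered crossing pairs obtained as total minus uncovered; it is not the dynamic program itself, and the function names are hypothetical.

```python
def cross_pairs(size1, size2):
    """Pairs {s, t} of T_v that do not lie inside a single child subtree.

    size1 and size2 are the numbers of nodes of the two child subtrees.
    """
    return (size1 + 1) * (size2 + 1) - 1

def covered_cross_pairs(size1, size2, m1, m2):
    """Covered crossing pairs when v is not selected and the child subtrees
    have m1 and m2 top nodes: all crossing pairs minus the uncovered ones."""
    uncovered = (m1 + 1) * (m2 + 1) - 1
    return cross_pairs(size1, size2) - uncovered
```

Take a node v with two leaf children v1 and v2. There are three crossing pairs. With C empty we have m1 = m2 = 1 and none of them is covered. With C = {v1} we have m1 = 0 and m2 = 1, only the pair {v, v2} stays uncovered, and two crossing pairs are covered.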

Now consider the case $m = 0$. If $v \notin C$ then we can proceed as in the case $m = 1$. If $v \in C$
then any of the $\bar\sigma$ pairs $s, t$ with $s$ and $t$ not in the same subtree is covered by $C$. Hence, if
$v \in C$, then $B[v, \sigma, 0]$ equals the minimum $\bar B$ of the values $c(v) + B[v_1, \sigma_1, 0] + B[v_2, \sigma_2, 0]$ such
that $\sigma_1 + \sigma_2 + \bar\sigma = \sigma$, which can be computed in $O(\sigma) = O(n^2)$ time. Altogether we have that
$B[v, \sigma, 0] = \min\{\bar B, B[v, \sigma, 1]\}$.

Finally, if $v$ is a leaf then $B[v, 0, m] = 0$ for $m = 0, 1$.

Since there are $O(n^4)$ values $B[v, \sigma, m]$ each of which can be computed in $O(n^3)$ time we obtain
a total running time of $O(n^7)$ for computing the optimum budgeted betweenness centrality on a
binary tree.

Theorem 4 The budgeted MBC problem can be solved in polynomial time on a tree. □ 

6 Concluding Remarks 

We have introduced a reduction from MBC to Maximum Coverage that allows us to simplify 
the analysis of the greedy approach of Dolev et al. [1] for the unit-cost version and to derive a 
new algorithm for a budgeted generalization of MBC. We have provided a class of tight examples 
for both algorithms. Moreover, we have shown that MBC is APX-complete but can be solved in 
polynomial time on trees. 

Our reduction suggests considering MBC as a special case of MAXIMUM COVERAGE. It is
well-known that MAXIMUM COVERAGE cannot be approximated strictly better than $1 - 1/e$ unless
P = NP [6]. However, it seems to be difficult to derive a similar upper bound for MBC since the
MAXIMUM COVERAGE instances corresponding to MBC have a very specific structure. As there is
at least one shortest path for any pair of nodes in a connected graph, the number $|\mathcal{F}|$ of sets in the
MAXIMUM COVERAGE instance is $O(\sqrt{|S|})$ where $S$ is the set of ground elements.

On the other hand, the best known algorithm for MAXIMUM VERTEX COVER, developed by
Ageev and Sviridenko [1], has a ratio of 3/4. Our approximation preserving reduction from MAXIMUM
VERTEX COVER to MBC provided in Section 4 shows that a significantly better approximability
result for MBC would also imply a better approximation for MAXIMUM VERTEX COVER.
Conversely, this reduction suggests trying the techniques of Ageev and Sviridenko [1] as possible
avenues to improve the approximation factor for MBC.

Appendix



A Justification of the Modified Betweenness Centrality 

Recall that we used in the proof of the APX-completeness in Section 4 the modified centrality

$$\mathrm{GBC}'(C) := \sum_{(u_i, v_j) \text{ is essential}} \frac{\sigma_{u_i,v_j}(C)}{\sigma_{u_i,v_j}}$$

instead of GBC. In order to justify this more formally, we give an approximation preserving reduc- 
tion from the modified problem version to MBC. 

Let OPT and OPT' denote the optimum centrality for the problem instance $G'$ (for the construction
of $G'$ see Section 4) with respect to GBC and GBC', respectively. Consider a k-element node
set $C$ such that $\mathrm{GBC}(C) \ge (1 - \varepsilon)\,\mathrm{OPT}$. We claim that $\mathrm{GBC}'(C) \ge (1 - 2\varepsilon)\,\mathrm{OPT}'$ if $l$ was chosen
large enough. This completes the reduction from the modified problem version to the original one.

The claim can be seen as follows: The first type of inessential node pairs are the pairs $(v_i, v_j)$ of
copies of the same node $v \in V$. The only shortest path between $v_i$ and $v_j$ is the direct connection.
Hence any node in $C$ occupies at most $l - 1$ of such paths. This implies that the centrality of $C$
drops by at most $O(nl)$ when we ignore inessential node pairs of the first type.

The second type of inessential node pairs are the pairs $(z, z')$ where at least one of the nodes $z$ and
$z'$ is not a copy of a node in $G$. In other words, this node is either a node in $G$ or an intermediate
node $z_{uv}$ for some pair $u, v$ of nodes that are not adjacent in $G$. Since there are only $O(m^2 l)$ inessential pairs of this type the
absolute error we make when switching to the modified betweenness centrality is bounded by $c m^2 l$
for some constant $c$, that is, $\mathrm{GBC}'(C) \ge \mathrm{GBC}(C) - c m^2 l$.

Let $(u, v)$ be some edge in $G$. We can cover at least all $l^2$ shortest $u_i$-$v_j$ paths in $G'$ by including
$u$ into our solution $C$. This implies $\mathrm{OPT}' \ge l^2$. By choosing $l \ge (c m^2)/\varepsilon$ we obtain
$c m^2 l \le \varepsilon l^2 \le \varepsilon\,\mathrm{OPT}'$. Since moreover $\mathrm{OPT} \ge \mathrm{OPT}'$, our
solution $C$ has a modified centrality $\mathrm{GBC}'(C)$ of at least $(1 - \varepsilon)\,\mathrm{OPT} - c m^2 l \ge (1 - 2\varepsilon)\,\mathrm{OPT}'$ as
desired.



B Polynomial Time Algorithm for Trees of Arbitrary Degree 

In Section 5 we have provided a polynomial time algorithm for solving MBC on binary trees.

As we remarked, the case of arbitrary trees can essentially be reduced to the case of a binary
tree. To this end consider a node $v$ with children $v_1, \ldots, v_k$.

The case $k = 1$ can be handled similarly to $k = 2$ and is in fact easier. If $m \ge 1$ then $B[v, \sigma, m]$
equals $B[v_1, \sigma, m - 1]$. If $m = 0$ then $B[v, \sigma, 0]$ is the minimum of $B[v, \sigma, 1]$ and
$c(v) + B[v_1, \sigma - |T_{v_1}|, 0]$.

If $k \ge 3$ we face the problem that there are possibly exponentially many ways of distributing
the $m$ top nodes to the subtrees $T_{v_i}$. To overcome this difficulty we split $v$ into $k - 1$ binary nodes.
More precisely, we introduce a set $U(v)$ of $k - 1$ new nodes $u_1, \ldots, u_{k-1}$ and replace $v$ and the edges
incident at $v$ with the edge set $\{\, (u_i, v_i), (u_i, u_{i+1}) \mid i = 1, \ldots, k - 1 \,\}$. Here we set $u_k = v_k$. The
cost $c(u_{k-1})$ is set to $c(v)$; the remaining costs $c(u_i)$ are zero.

Now we can treat these newly introduced nodes very similarly to the binary nodes of the original
tree. The difference is that we need to handle the nodes in $U(v)$ as a single top node and as a single
end node of paths. Moreover, we have to ensure that either all of the nodes in $U(v)$ are included
in $C$ or none of them. (One can picture the $u_1$-$u_{k-1}$ path as an expanded version of the originally
single node $v$.)

To this end we handle $u_{k-1}$ like a regular binary node as described above. Now consider $u_i$
with $i \le k - 2$ having children $v_i$ and $u_{i+1}$. If $m \ge 1$ and hence $u_i \notin C$ then $B[u_i, \sigma, m]$ equals the
minimum value $B[v_i, \sigma_1, m_1] + B[u_{i+1}, \sigma_2, m_2]$ such that $m_1 + m_2 = m$, $m_2 \ge 1$ and
$\sigma_1 + \sigma_2 + |T_{v_i}| \cdot (|T_{u_{i+1}} - U(v)| + 1) - m_1 m_2 = \sigma$. We require that $m_2 \ge 1$ since we have to ensure that either all
of the nodes in $U(v)$ are included in $C$ or none of them.

Now consider the case $m = 0$. For $i = 1, \ldots, k - 2$ let $B_i$ be the minimum value
$B[v_i, \sigma_1, 0] + B[u_{i+1}, \sigma_2, 0]$ such that $\sigma_1 + \sigma_2 + |T_{v_i}| (|T_{u_{i+1}} - U(v)| + 1) = \sigma$. We have to ensure that only $B[\cdot]$-values
are combined in which the inclusion of $v$ in a central node set $C$ (i.e. $m = 0$) is assumed
either for all $u_i$ ($i = 1, \ldots, k - 1$) or for none. Therefore, the only node for which we include the
case $m \ge 1$ in the case $m = 0$ is $u_1$ (remember that $m$ is only a lower bound for the number of top
nodes). Thus $B[u_1, \sigma, 0]$ equals $\min\{B_1, B[u_1, \sigma, 1]\}$. For $2 \le i \le k - 2$ we get $B[u_i, \sigma, 0] = B_i$. We
also have to ensure that for $u_{k-1}$ the cost $B[u_{k-1}, \sigma, 1]$ is not considered during the computation of
$B[u_{k-1}, \sigma, 0]$ which leads to $B[u_{k-1}, \sigma, 0] = \bar B$ where, as in Section 5, $\bar B$ equals the minimum of the
values $c(v) + B[v_{k-1}, \sigma_1, 0] + B[v_k, \sigma_2, 0]$ such that $\sigma_1 + \sigma_2 + (|T_{v_{k-1}}| + 1)(|T_{v_k}| + 1) - 1 = \sigma$. All of
the above computations can be carried out in $O(n^3)$ time per value $B[u_i, \sigma, m]$.

Finally, we observe that the number of nodes can at most double by the above splitting construction,
which yields Theorem 4.